# Edge device deployment

PP OCRv3 Mobile Rec
Apache-2.0
PP-OCRv3_mobile_rec is a lightweight text line recognition model developed by the PaddleOCR team. It uses the SVTR algorithm and recognizes both Chinese and English text, with a focus on Chinese scenarios.
Text Recognition Supports Multiple Languages
PaddlePaddle
200
0
Holo1 7B GGUF
Apache-2.0
The Holo1-7B GGUF model is part of the Surfer-H system and is suited to multimodal tasks such as visual document retrieval. It is particularly strong at web page interaction and web monitoring, achieving high accuracy at low cost.
Image-to-Text Transformers English
Mungert
663
0
Llama 3.1 Nemotron Nano VL 8B V1
Other
Llama-3.1-Nemotron-Nano-VL-8B-V1 is a vision-language model for document intelligence. It can query and summarize images and videos, and it supports deployment in multiple environments.
Image-to-Text Transformers
nvidia
1,092
66
Llama 3.1 Nemotron Nano 4B V1.1 GGUF
Other
Llama-3.1-Nemotron-Nano-4B-v1.1 is a large language model derived from Llama 3.1 that balances accuracy and efficiency. It is suitable for scenarios such as AI agents and chatbots.
Large Language Model Transformers English
Mungert
2,177
1
Acemath RL Nemotron 7B GGUF
Other
AceMath-RL-Nemotron-7B is a mathematical reasoning model trained entirely through reinforcement learning, starting from DeepSeek-R1-Distill-Qwen-7B. It performs strongly on mathematical reasoning tasks and also generalizes to some extent to coding tasks.
Large Language Model Transformers English
Mungert
633
1
Dfine Large Obj365
Apache-2.0
D-FINE is a powerful real-time object detector that achieves exceptional localization accuracy by redefining the bounding box regression task in DETR models.
Object Detection Transformers English
ustc-community
785
2
Dfine Medium Obj2coco
Apache-2.0
D-FINE is a real-time object detection model that achieves exceptional localization accuracy by redefining the bounding box regression task.
Object Detection Transformers English
ustc-community
3,610
4
Qwen2.5 VL 3B Instruct GGUF
Qwen2.5-VL-3B-Instruct is a 3B-parameter multimodal model that generates text from image and text input. This GGUF build is optimized for the vision capabilities of llama.cpp.
Image-to-Text English
Mungert
10.44k
8
Gemma 3 27b It GGUF
A GGUF-quantized version of Gemma 3 with 27B parameters that supports image-text interaction tasks.
Image-to-Text
Mungert
4,034
6
Rtdetr V2 R101vd
Apache-2.0
RT-DETRv2 is an improved real-time object detection model based on the DETR architecture. It raises detection performance through techniques such as selective multi-scale feature extraction and dynamic data augmentation.
Object Detection Transformers English
PekingU
1,892
2
Rtdetr V2 R34vd
Apache-2.0
RT-DETRv2 is an improved version of the real-time object detection Transformer model, enhancing performance through multi-scale feature extraction and optimized training strategies.
Object Detection Transformers English
PekingU
886
1
Qwen2 Audio 7B GGUF
Apache-2.0
Qwen2-Audio is a small-scale multimodal model that accepts audio and text input, enabling voice interaction without a separate speech recognition module.
Audio-to-Text English
NexaAIDev
5,001
153